Abstract: The Time Projection Chamber of the ALICE experiment
at the CERN Large Hadron Collider features highly
integrated on-detector read-out electronics. It follows the
general trend of high energy physics experiments of placing the
front-end electronics as close to the detector as possible, only
some 10 cm away from its active volume. Being located close to
the beams and the interaction region, the electronics is subject to
a moderate radiation load, which allowed us to use commercial
off-the-shelf components. However, they had to be selected
and qualified carefully for radiation hardness, and measures had to
be taken to protect their functionality against soft errors, i.e.
single event upsets.

Here we report on the first measurements of LHC induced
radiation effects on ALICE front-end electronics and on how
they compare with expectations.

Index Terms— single event upsets, TPC, ALICE, CERN. 

I. Introduction 

Radiation induced effects on the electronics of high energy 
physics experiments are gaining importance due to the ever 
increasing density of electronics and their placement into 
zones of high particle flux. When energetic particles cross 
the electronics they may release a significant amount of 
charge along their paths leading to diverse erratic behaviours, 
termed single event effects (SEEs). Amongst them are single 
event upsets (SEUs), which refer to flip-flops/memory cells 
changing their state (0 → 1 or 1 → 0). These effects are of
major importance, even in areas of moderate radiation load, 
because they may accumulate and lead to system failures if 
no mitigation is applied. The estimation of SEU rates and the 
protection of critical bits is therefore an important task in the 
design process of detector read-out electronics. 




The ALICE Time Projection Chamber (TPC) is the main 
tracking detector of the ALICE experiment at the CERN Large 
Hadron Collider (LHC) [1], [2]. It comprises a hollow cylinder,
placed concentrically around the beam-line, with an active
volume constrained by 84.8 < r < 246.6 cm (radial) and
|z| < 249.7 cm (along the beam line, see Fig. 1). The detector
is equipped with multi-wire proportional chambers that are 
mounted on the end-plates and uses a high granularity pad 
read-out with 557,568 pads. 

The large number of active channels made it necessary to mount
the electronics for digital pre-processing and temporary data
storage on the detector; it is placed only some 10 cm away
from the end-plates. The signal amplification, digitisation and
processing are performed by two custom ASICs: the pre-amplifier
and shaping amplifier (PASA, 0.35 µm) and the
ADC and signal processor (ALTRO, 0.25 µm). A total of
34,848 ALTROs and 34,848 PASAs are utilised to read out
the detector.

For electronics operated that close to the
interaction region, radiation tolerance was anticipated to
be of importance from the very first design stages. Detailed
simulations showed that, at its position, the radiation load on
the electronics is on the one hand too high to use arbitrary
off-the-shelf electronics, but on the other hand too low to justify
the cost of radiation hardened components. Instead, considerable
effort was put into a) qualifying commercial and custom made
components for their radiation hardness in terms of total dose
and b) implementing mitigation for single event effects [3], [4].

Here we report on the first direct observation of SEUs in 
the TPC front-end electronics, i. e. in the ALTRO memories, 
which are attributed to particles emerging from collisions. At 
this early stage of operation of accelerator and experiment, 
priority is given to understanding the detector signals in detail. 
This in particular implies that not all sophisticated features of 
the electronics for on-detector signal processing are switched 



TABLE I
Distribution of the front-end electronics and bits masked in the analysis.

Partition    Radial position    FECs/sector    Bits/TPC (masked)

Fig. 1. Layout of the TPC (picture from [1]).
TABLE II
Parameters of selected subsystems and their mitigation strategy.

Component  Part number             Bits/TPC     SEU cross-section          Mitigation
ALTRO      -                       29 · 10^9    5.5 · 10^-1? cm^2 bit^-1   Hamming-encoded state machines
BC         Altera AP1K30TC 144-3   0.92 · 10^9  unknown                    none
RCU        Xilinx XC2VP7-6-FF672   2.0 · 10^9   3.7 · 10^-?? cm^2 bit^-1   active partial reconfiguration


on, which in turn allowed us to utilise them as a radiation 
monitor. It should be noted that the measurement thus does 
not interfere with the normal operation and data-taking of the 
detector at any stage. 

The electronics is mounted on 4,356 front-end cards (FECs),
each one housing 8 ALTROs and 8 PASAs. They are distributed
over 2 sides ("A" and "C"), 18 sectors in azimuth,
and 6 partitions in the radial direction (Fig. 1). The number of FECs
per partition is determined by the anticipated track density and
the trapezoidal shape of the sectors (Tab. I). The vast number
of "unused" bits and the spatial distribution of the electronics
make the TPC a unique device for a quantitative differential
measurement of hadron flux.

Apart from the read-out chips themselves, logic is added to
monitor the physical parameters (temperatures, voltages, and
currents) of each FEC ("board controller", BC), as well as
to steer the read-out process of each partition by a read-out
control unit (RCU). These devices are based on SRAM FPGAs
and are thus sensitive to SEUs as well. Although we do not
measure the SEUs in those devices here, we are able to infer their number
by scaling with the measured cross-sections and the number of sensitive
bits.

II. Anticipated radiation effects 

A. Radiation environment 

Extensive simulations of both the instantaneous flux and the
integrated fluence for different particle species were carried
out [2]. Here we follow a simplified model, assuming a
flat pseudo-rapidity distribution ΔN_ch/Δη of primary charged
particles N_ch over the relevant values of pseudo-rapidity η.
For a given radial position r on the end-plate we obtain their
flux (number of charged particles N_ch per area S) as:

$$\frac{dN_{ch}}{dS} = \frac{\Delta N_{ch}}{\Delta\eta} \cdot \frac{1}{2\pi r^{2}\sqrt{1 + (r/z_0)^{2}}} \qquad (1)$$

where z_0 = ±263 cm is the z-position of the respective
ALTRO chips. Furthermore we assume that the number of
particles that cause SEUs is proportional via a factor A to
this number. We hereby neglect any effect due to beam-gas
interactions and any precise treatment of secondary particle
production. Also, the containment of low-energy particles
due to the magnetic field (solenoidal, 0.5 T in beam direction)
is disregarded.

Finally, the number of SEUs is proportional to the particle
track length per volume rather than to the number of particles
per area, which introduces another factor of √(1 + (r/z_0)^2).



This yields our final model: 
The tests were conducted with 7 TeV minimum bias
proton-proton collisions, for which one obtains dN_ch/dη (η = 0) ≈ 6 [5].
For heavy-ion running one assumes a factor
of 100 more primary particles per collision [6]. In its final
configuration the LHC will deliver proton-proton and heavy-ion
collisions with approximate rates of up to 1 MHz and
10 kHz, respectively. Both scenarios lead to a mean time between
failures (the crucial parameter for mitigation strategies)
about a factor of 10 smaller than under the current running
conditions.
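The statement that both scenarios stress the electronics similarly can be checked with a few lines of arithmetic. The sketch below assumes, as the simplified model does, that the SEU rate scales with the rate of primary charged particles per unit of pseudo-rapidity; the function and variable names are illustrative only.

```python
# Inputs quoted in the text: dN_ch/deta ~ 6 for 7 TeV proton-proton collisions,
# a factor of 100 more primaries for heavy ions, and final collision rates
# of about 1 MHz (proton-proton) and 10 kHz (heavy ions).
DNDETA_PP = 6.0
HEAVY_ION_FACTOR = 100.0


def primary_rate_per_unit_eta(collision_rate_hz, dndeta):
    """Primary charged particles per second and per unit of pseudo-rapidity."""
    return collision_rate_hz * dndeta


pp_rate = primary_rate_per_unit_eta(1e6, DNDETA_PP)
hi_rate = primary_rate_per_unit_eta(1e4, DNDETA_PP * HEAVY_ION_FACTOR)

print(f"proton-proton: {pp_rate:.1e} particles/s per unit eta")
print(f"heavy ions:    {hi_rate:.1e} particles/s per unit eta")
```

Under these assumptions the higher multiplicity of heavy-ion collisions is compensated by their lower rate, so both scenarios give the same particle rate and hence the same expected mean time between failures.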

Although very simplistic, the model correctly describes the qualitative
behaviour and the order of magnitude of the simulations.
For the sake of simplicity and because the overall uncertainties
of our measurement fall within that range, we follow this
model.
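The geometry behind this simplified model can be written out explicitly. The following is a sketch under the assumptions stated above (flat pseudo-rapidity distribution, straight tracks from the nominal interaction point, no secondaries); the symbols θ, σ_SEU and p_SEU are introduced here for illustration and are not taken from the original text.

```latex
% Polar angle of a straight track crossing the plane z = z_0 at radius r
\tan\theta = \frac{r}{z_0}, \qquad
\eta = -\ln\tan\frac{\theta}{2}
\;\Rightarrow\;
\left|\frac{d\eta}{dr}\right|
  = \frac{1}{\sin\theta}\,\frac{d\theta}{dr}
  = \frac{z_0}{r\sqrt{r^2 + z_0^2}}

% Particles per area on a ring of area dS = 2\pi r\,dr
\frac{dN_{ch}}{dS}
  = \frac{\Delta N_{ch}}{\Delta\eta}\,\frac{1}{2\pi r}\left|\frac{d\eta}{dr}\right|
  = \frac{\Delta N_{ch}}{\Delta\eta}\,\frac{1}{2\pi r^2\sqrt{1 + (r/z_0)^2}}

% Track length in a thin layer scales with 1/\cos\theta
\frac{1}{\cos\theta} = \sqrt{1 + (r/z_0)^2}

% Assumed form of the SEU probability per bit and collision
p_{SEU}(r)
  = A\,\sigma_{SEU}\,\frac{dN_{ch}}{dS}\,\sqrt{1 + (r/z_0)^2}
  = A\,\sigma_{SEU}\,\frac{\Delta N_{ch}}{\Delta\eta}\,\frac{1}{2\pi r^2}
```

In this form the square-root factors cancel, so the expected SEU probability falls off as 1/r^2 and no longer depends on z_0.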

B. SEU cross-sections 

All components of the front-end read-out were characterised
in test beams [7], [8] and qualified for radiation hardness. The
measured SEU cross-section for the ALTRO memories was
found to be 5.5 · 10^-1? cm^2 bit^-1. A table summarising
the number of bits and the respective cross-sections for the
other selected devices in the read-out chain is given in Tab. II.
Moreover, the components were qualified in terms of total dose.
Here only components withstanding a total dose 20 times
larger than the anticipated one were accepted in order to take
into account the simulation uncertainties of the total fluence.

C. Implemented mitigation techniques 

Since physical shielding of the electronics against radiation 
is not possible for the TPC due to the layout of the detector, 
any radiation-related errors in the read-out electronics system 
need to be handled by mitigation techniques at architecture and 
circuit level. Depending on the anticipated likelihood of errors 
and their possible system-level impact, different mitigation 
techniques were chosen for the different components. This is
summarised in Tab. II and discussed below.

At the predicted error rates (below 1 bit per second), data
corruption is not a major concern. The data transfer protocol used
ensures that errors do not spread, such that at most one single
channel in one single event is affected by an SEU. The
ALTRO chip, however, implements protection in the memory 
management and interface state machines. They are the most 



TABLE III
Data sets with at least one SEU.

Fig. 2. Correlation of integrated number of collisions and number of
SEUs within a measurement period and fitted proportionality. The error bars
correspond to 1σ counting errors.



important state machines, dealing with the on-chip data buffers
and the interface protocol respectively, and any failure can lead
to scrambled event data streams or even electrical clashes
on the bus, irreversibly damaging the electronics. These finite
state machines are protected against SEUs by Hamming
encoding of the state vectors. They are thus protected against
the effects of single SEUs (one bit-flip) and can report the case
of a double bit-flip. In the latter case the correct state vector
cannot be recovered and the state machine returns to the idle
state. Any Hamming error is tracked and reported in a status
register.
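The behaviour described above (correct a single bit-flip, report a double bit-flip and fall back to idle) can be illustrated with an extended Hamming code. The sketch below is not the ALTRO implementation: the state-vector width, the bit layout and the `IDLE` code are assumptions chosen for illustration. It uses a Hamming(7,4) code plus one overall parity bit, since a plain Hamming code cannot both correct single errors and detect double errors.

```python
IDLE = 0  # hypothetical code of the idle state


def encode(state):
    """Encode a 4-bit state into an 8-bit extended Hamming word (list of bits)."""
    d = [(state >> i) & 1 for i in range(4)]
    c = [0] * 8                      # c[1..7]: Hamming(7,4); c[0]: overall parity
    c[3], c[5], c[6], c[7] = d       # data bits sit at the non-power-of-two positions
    c[1] = c[3] ^ c[5] ^ c[7]        # parity over positions with bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]        # parity over positions with bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]        # parity over positions with bit 2 set
    c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]
    return c


def decode(codeword):
    """Return (state, status); status is 'ok', 'corrected' or 'double'."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = 0
    for bit in c:
        overall ^= bit               # parity over the whole 8-bit word
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        c[syndrome] ^= 1             # single flip; syndrome 0 means the parity bit
        status = "corrected"
    else:
        return None, "double"        # non-zero syndrome with even overall parity
    state = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return state, status


def recover_state(codeword):
    """Mimic the described policy: fall back to IDLE on an uncorrectable error."""
    state, status = decode(codeword)
    return (IDLE if status == "double" else state), status


word = encode(0b1010)
word[6] ^= 1                         # emulate one SEU in the state register
print(recover_state(word))
```

A status flag like the one returned here plays the role of the status register mentioned in the text: every corrected or uncorrectable event stays visible to the system.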

Another weak point is the configuration registers of the
ALTRO. Their corruption can cause data integrity problems
in single chips. Systematic reconfiguration of these registers
during data taking is foreseen to keep such
errors very localised in time.

The FEC's board controller is implemented in an SRAM-based
FPGA. It is programmed from an on-board flash device
at power up. In case of any malfunction due to radiation
effects, it can be reprogrammed from the on-board flash device
with a software command to restore the configuration.

The event read-out of each partition is steered by a read-out
control unit (RCU), which is implemented in another SRAM-based
FPGA. Active partial reconfiguration is employed to
protect its configuration memory. It is realised using a
flash-based support FPGA that communicates both with the
SelectMAP interface of the Xilinx device and with an on-board
flash memory device. The latter stores the configuration
files needed for the Xilinx device.

III. SEU measurements

A. Procedure 

The measurement of SEUs, on which we report here, employs
currently unused memory cells of the ALTRO pedestal
memories by monitoring bit flips, which is a widely adopted
method in dedicated radiation monitoring systems (e.g. [9]).
The pedestal memories consist of 1024 10-bit SRAM cells
for every ALTRO channel. There are 16 channels in each of
the 34,848 ALTRO chips, accounting for a total number of
5.7 · 10^9 bit available to monitor SEUs.
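As a quick cross-check, the quoted channel and bit counts follow directly from the numbers given in the text. The constant names below are illustrative only.

```python
# Numbers quoted in the text
FECS = 4356                # front-end cards
ALTROS_PER_FEC = 8
CHANNELS_PER_ALTRO = 16
WORDS_PER_CHANNEL = 1024   # pedestal memory depth per channel
BITS_PER_WORD = 10

altros = FECS * ALTROS_PER_FEC
channels = altros * CHANNELS_PER_ALTRO
pedestal_bits = channels * WORDS_PER_CHANNEL * BITS_PER_WORD

print(f"ALTRO chips:   {altros}")
print(f"channels:      {channels}")
print(f"pedestal bits: {pedestal_bits:.2e}")
```

The channel count reproduces the 557,568 pads of the detector, and the pedestal memories add up to about 5.7 · 10^9 bit.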

The measurement was carried out as follows: 
(a) The memories were initialised with specific bit patterns 
(either "0101010101" or "1111111111"). 



(b) Right after being initialised, the contents of the memories
were verified against the pattern used. This allowed faulty
parts of the system to be detected.

(c) The detector was operated for some time. 

(d) The memories were read back and all locations were 
checked against their reference data obtained in step (b). 
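The four steps above can be sketched in software as follows. This is an illustration on an in-memory list, not the actual read-out software: the function names are hypothetical, one list element stands for one 10-bit pedestal word, and the bit-flips are injected by hand in place of real beam exposure.

```python
PATTERNS = (0b0101010101, 0b1111111111)  # the two 10-bit test patterns


def initialise(n_words, pattern):
    """Step (a): fill a memory of n_words 10-bit words with the pattern."""
    return [pattern] * n_words


def verify(memory, pattern):
    """Step (b): indices of words that are already wrong right after writing."""
    return {i for i, word in enumerate(memory) if word != pattern}


def count_bit_flips(memory, pattern, masked=frozenset()):
    """Step (d): number of flipped bits, skipping masked (faulty) words."""
    flips = 0
    for i, word in enumerate(memory):
        if i in masked:
            continue
        flips += bin(word ^ pattern).count("1")  # XOR marks the differing bits
    return flips


pattern = PATTERNS[0]
memory = initialise(1024, pattern)
faulty = verify(memory, pattern)      # empty for this ideal software memory

# Step (c) stand-in: emulate two upsets during operation
memory[5] ^= 0b0000000001
memory[700] ^= 0b0100000000

print(count_bit_flips(memory, pattern, masked=faulty))
```

Comparing against the reference from step (b) rather than against the nominal pattern alone is what allows permanently faulty cells to be excluded from the SEU count.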



A list of the conducted measurements is given in Tab. III.
Their timing is constrained by the operation of accelerator and
detector. A write and read cycle takes about 10 minutes and
within that time no data taking is possible. Because beam
time is precious, the cycles were performed whenever there was a gap,
right before injection and right after dumping the beam.

Because the anticipated number of errors is very low,
a careful selection of "good" memories was essential to
obtain a high sensitivity and to remove false positives in the
SEU detection. We had to mask channels for different reasons:

• Single FECs were not operational due to technical problems
and needed to be kept off.

• The memories of a few channels showed erratic behaviour
and reported random data.

• Two partitions had to be excluded due to a too large 
number of bad channels and communication problems. 

A summary of available and masked bits per partition is given
in Tab. I.

The number of SEUs is correlated to the total number of
collisions that happened during a measurement interval. This
number is obtained by integrating the instantaneous collision
rate as reported by the V0 detector of ALICE. The V0 runs
continuously and its values are reported to the LHC-wide
database.

B. Analysis 

To analyse the results in a quantitative way, all data sets 
were normalised to the respective number of concerned bits. 
This takes into account the number of masked channels, which 
were the same throughout the analysis of all datasets. 

The dependence of the number of SEUs on the number
of collisions is shown in Fig. 2. It indicates proportionality
and thus gives a first indication that SEUs are the origin of
the observed bit errors. The proportionality constant of (1.2 ±
0.1) · 10^-? per collision can directly be translated into an error
frequency when multiplied by the collision rate.
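A proportionality fit of this kind and its translation into an error frequency can be sketched as follows. The fit procedure actually used is not described in the text; the sketch assumes Poisson-distributed SEU counts, for which the maximum-likelihood estimate of the constant is the total number of SEUs divided by the total number of collisions. The data values are made up for illustration and are not the measured data sets.

```python
import math


def fit_proportionality(collisions, seus):
    """Maximum-likelihood slope k for N_SEU ~ Poisson(k * N_collisions).

    Returns (k, k_err), where k_err is the 1-sigma counting error.
    """
    total_collisions = sum(collisions)
    total_seus = sum(seus)
    k = total_seus / total_collisions
    k_err = math.sqrt(total_seus) / total_collisions
    return k, k_err


def error_rate(k, collision_rate_hz):
    """SEUs per second for a given collision rate."""
    return k * collision_rate_hz


# Hypothetical data sets: integrated collisions and observed SEUs per period
collisions = [1e8, 2e8, 4e8]
seus = [1, 3, 4]

k, k_err = fit_proportionality(collisions, seus)
print(f"k = ({k:.2e} +/- {k_err:.2e}) SEUs per collision")
print(f"rate at 1 MHz: {error_rate(k, 1e6):.2e} SEUs per second")
```

Because the estimate depends only on the totals, periods with few or no SEUs still contribute through their collision count, which matters at the low statistics of an early measurement.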
(a) Accumulated single event upsets (black boxes indicate partitions not analysed). Disabled FECs or bits (< 10%)
are not taken into account.

(b) Normalised differential distributions.

Fig. 3. Accumulated single event upsets (integral over all measurements). The error bars correspond to 1σ counting errors.



Next, we analysed the spatial dependence of the effect,
which is predicted by Eq. 2. The result is depicted in Fig. 3,
which shows the integrated number of SEUs over all data
sets. The absolute number of errors within each read-out 
partition, as well as normalised SEU probabilities per bit and 
collision, differential in radial direction and azimuthal angle, 
are provided. 

Despite limited statistics, the data indicate a radial dependence
(see Fig. 3(b)), as is expected from a beam induced
process (Eq. 2). Also, Fig. 3(b) shows no correlation between
the number of SEUs and sector position, i.e. no azimuthal
dependence, as expected. Even the indicated slight asymmetry
between A- and C-side is expected from simulations and is
due to a muon absorber placed on the C-side.

Taking the cross-section from Tab. II, we obtain the proportionality
factor A of Eq. 2 to be 4.5 ± 0.3, which, taking into
consideration the discussion in Sec. II-A, is very reasonable.

To finally rule out the possibility that the observed effects 
were not due to collision induced SEUs, but e. g. due to 
some electrical or thermal problem, the same measurement 
procedure was repeated at a technical stop (i. e. the LHC was 
off and thus did not deliver any beam). Thereby the operating 
conditions were chosen as close as possible to the situation 
with bit errors: the TPC was operated and took (noise) data 
at similar rates. In these conditions no errors were observed. 

IV. Conclusions 

We have presented the first quantitative measurement of 
single event upsets with ALICE. It was shown to comply with 
theoretical expectations quantitatively within an order of mag- 
nitude and qualitatively by resembling the radial dependence. 

The measurement not only justifies the effort put into
designing the built-in protection of the system but also underlines
its importance in view of the anticipated increase of particle
flux in heavy-ion and high rate proton-proton collisions at
the LHC. Moreover, it clearly shows that already at this early
stage of LHC operation SEUs have become visible and should not
be neglected.